Code Tutorial

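import torch
import IPython.display as ipd

sr = 44100  # sample rate in Hz
duration = 5  # clip length in seconds
# Placeholder audio: one channel of Gaussian white noise, shape (1, sr * duration)
audio_sample = torch.randn(1, sr * duration)
# Render an inline audio player in the notebook
ipd.Audio(audio_sample.numpy(), rate=sr)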

Stable Audio Open Tutorial

Stable Audio Open is fully available through HuggingFace. To run Stable Audio Open locally, you’ll first need a HuggingFace account and an access token ($HF_TOKEN), which you can generate by following the instructions at https://huggingface.co/docs/huggingface_hub/en/quick-start#authentication. Once you have the token, export it as an environment variable with a bash command like

export HF_TOKEN="YOUR_HF_TOKEN"
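Before going further, it can help to confirm that the token is actually visible to the Python process running the notebook. The following is a minimal sketch using only the standard library; the helper name `read_hf_token` is hypothetical (it is not part of any HuggingFace package), and treating the literal placeholder string as "unset" is an assumption made for this tutorial.

```python
import os


def read_hf_token(env=None):
    """Return the token stored in HF_TOKEN, or raise if it is missing.

    Hypothetical helper written for this tutorial, not a HuggingFace API.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    # Treat the tutorial's placeholder value as "not set".
    if not token or token == "YOUR_HF_TOKEN":
        raise RuntimeError(
            "HF_TOKEN is not set; export it in the shell before starting the notebook."
        )
    return token


if __name__ == "__main__":
    try:
        token = read_hf_token()
        # Print only the length so the secret never appears in notebook output.
        print(f"HF_TOKEN found ({len(token)} characters)")
    except RuntimeError as err:
        print(err)
```

If the check fails inside Jupyter even though the variable is exported, the notebook server was probably started from a different shell session; restarting it from the shell where `export` was run usually resolves this.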

The rest of the tutorial closely follows the demo in the public Stable Audio Open resources.

First, we’ll install some dependencies if you don’t already have them. Stable-Audio-Tools can be a bit finicky to install directly, so we suggest making a dedicated virtual environment (and not conda) to run this notebook.
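To check which kind of environment the notebook kernel is actually running in, a rough heuristic like the one below may be useful. It is a sketch under stated assumptions: `describe_environment` is a hypothetical helper, and it relies on two conventions (a venv makes `sys.prefix` differ from `sys.base_prefix`, and a conda environment contains a `conda-meta` directory) rather than on any official detection API.

```python
import os
import sys


def describe_environment(prefix=None, base_prefix=None, env=None):
    """Classify the running interpreter as 'conda', 'venv', or 'system'.

    Hypothetical helper for this tutorial; the checks are heuristics.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    env = os.environ if env is None else env

    # Conda environments contain a `conda-meta` directory and usually
    # point CONDA_PREFIX at the active environment.
    has_conda_meta = os.path.isdir(os.path.join(prefix, "conda-meta"))
    if has_conda_meta or env.get("CONDA_PREFIX") == prefix:
        return "conda"
    # Inside a venv, sys.prefix is the environment directory and
    # sys.base_prefix is the interpreter it was created from.
    if prefix != base_prefix:
        return "venv"
    return "system"


if __name__ == "__main__":
    print(f"Running in a {describe_environment()} environment: {sys.prefix}")
```

A result of `venv` matches the setup suggested above; `conda` or `system` means the packages in the next step would be installed somewhere other than a dedicated virtual environment.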

!pip install torch torchaudio torchvision stable-audio-tools einops
Requirement already satisfied: ptyprocess>=0.5 in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from pexpect>4.3->ipython->aeiou==0.0.20->stable-audio-tools) (0.7.0)
Requirement already satisfied: platformdirs>=2.5.0 in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from pooch>=1.0->librosa>=0.8.1->aeiou==0.0.20->stable-audio-tools) (4.3.2)
Requirement already satisfied: propcache>=0.2.0 in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from yarl<2.0,>=1.12.0->aiohttp!=4.0.0a0,!=4.0.0a1->s3fs->stable-audio-tools) (0.2.0)
Requirement already satisfied: future>=0.16.0 in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from pyloudnorm->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (1.0.0)
Requirement already satisfied: fire in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from randomname->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (0.7.0)
Requirement already satisfied: executing>=1.2.0 in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from stack-data->ipython->aeiou==0.0.20->stable-audio-tools) (2.1.0)
Requirement already satisfied: asttokens>=2.1.0 in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from stack-data->ipython->aeiou==0.0.20->stable-audio-tools) (2.4.1)
Requirement already satisfied: pure-eval in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from stack-data->ipython->aeiou==0.0.20->stable-audio-tools) (0.2.3)
Requirement already satisfied: absl-py>=0.4 in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from tensorboard->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (2.1.0)
Requirement already satisfied: grpcio>=1.48.2 in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from tensorboard->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (1.67.1)
Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from tensorboard->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (0.7.2)
Requirement already satisfied: werkzeug>=1.0.1 in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from tensorboard->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (3.1.3)
Requirement already satisfied: mdurl~=0.1 in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from markdown-it-py->panel>=1.0->holoviews->aeiou==0.0.20->stable-audio-tools) (0.1.2)
Requirement already satisfied: webencodings in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from bleach->panel>=1.0->holoviews->aeiou==0.0.20->stable-audio-tools) (0.5.1)
Requirement already satisfied: termcolor in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from fire->randomname->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (2.5.0)
Requirement already satisfied: uc-micro-py in /Users/seungheond/anaconda3/envs/p310/lib/python3.10/site-packages (from linkify-it-py->panel>=1.0->holoviews->aeiou==0.0.20->stable-audio-tools) (1.0.3)
Using cached argparse-1.4.0-py2.py3-none-any.whl (23 kB)
Installing collected packages: argparse
Successfully installed argparse-1.4.0

If you are running this locally, you can simply set HF_TOKEN in your local environment (as done below). If you’re using a Colab notebook, you first need to add your HF_TOKEN as a “secret” in Colab, and the command below won’t have any effect in that case.

import os
os.environ['HF_TOKEN'] = 'Your API key'
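
As a rough sketch of handling both cases in one place, the helper below looks for the token in the environment first and falls back to Colab secrets. The function name `resolve_hf_token` is hypothetical (it is not part of any library), and it assumes the Colab secret is stored under the same name, HF_TOKEN. The `google.colab.userdata` module is only importable inside Colab, so elsewhere the helper simply returns None.

```python
import os


def resolve_hf_token(name="HF_TOKEN"):
    """Return a token from the environment, else from Colab secrets, else None.

    Hypothetical helper for illustration; assumes the Colab secret uses
    the same name as the environment variable.
    """
    token = os.environ.get(name)
    if token:
        return token
    try:
        # Only available when running inside Google Colab.
        from google.colab import userdata
    except ImportError:
        return None
    try:
        return userdata.get(name)
    except Exception:
        # Secret missing, or notebook access to the secret not granted.
        return None


token = resolve_hf_token()
if token is None:
    print("No HF_TOKEN found; the gated model download will likely fail.")
```

If a token is found this way, it can be written to `os.environ['HF_TOKEN']` before loading the model, exactly as in the cell above.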

Next, we can load the model from Hugging Face. Note that there are some known dependency issues with stable-audio-tools on M1 Macs, so we recommend running this as a Colab notebook (or on a Linux system).

import torch
import torchaudio
# import librosa
from einops import rearrange
from stable_audio_tools import get_pretrained_model
from stable_audio_tools.inference.generation import generate_diffusion_cond
import IPython.display as ipd
from functools import partial

device = "cuda" if torch.cuda.is_available() else "cpu"

# Download model
model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
sample_rate = model_config["sample_rate"]
sample_size = model_config["sample_size"]

model = model.to(device)
If HF_TOKEN is not set, or your account has not been granted access to the gated stabilityai/stable-audio-open-1.0 repository, the cell above fails with a GatedRepoError like the following (traceback abridged):

GatedRepoError: 401 Client Error. (Request ID: Root=1-6730f40e-0035d0002276c8da3c048b22;8b4d2fe6-7b98-4702-a05b-cd7565572fb1)

Cannot access gated repo for url https://huggingface.co/stabilityai/stable-audio-open-1.0/resolve/main/model_config.json.
Access to model stabilityai/stable-audio-open-1.0 is restricted. You must have access to it and be authenticated to access it. Please log in.

First, we’ll wrap the sampling code in a simpler wrapper, as there are a few parameters that need to be provided but are not especially useful to play around with.

# this just cleans things up a bit so the code below highlights the important knobs
easy_generate = partial(generate_diffusion_cond, sample_size=sample_size, sigma_min=0.3, sigma_max=500, device=device)
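
To make the wrapping step concrete, here is a small sketch of what `functools.partial` does, using a toy stand-in function rather than the real sampler. The function `toy_generate` and its values are hypothetical and exist only for illustration; only `partial` itself is the real standard-library call.

```python
from functools import partial


def toy_generate(model, steps, cfg_scale, sample_size, device):
    """Hypothetical stand-in for a sampler; just echoes its arguments."""
    return {
        "model": model,
        "steps": steps,
        "cfg_scale": cfg_scale,
        "sample_size": sample_size,
        "device": device,
    }


# Pre-fill the arguments we do not want to repeat on every call.
easy_toy_generate = partial(toy_generate, sample_size=2048, device="cpu")

# Only the interesting knobs are passed at call time.
result = easy_toy_generate("toy-model", steps=10, cfg_scale=7.5)

# Pre-filled keyword arguments can still be overridden when needed.
overridden = easy_toy_generate("toy-model", steps=10, cfg_scale=7.5, device="cuda")
```

The same pattern applies to `easy_generate` above: the pre-filled values stay fixed unless a call passes them explicitly.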

Next we can define our conditioning, which for the default Stable Audio Open involves text, timing, and overall length.

# Set up text and timing conditioning
conditioning = [{
    "prompt": "clean guitar, sweep picking, 140 bpm, G minor",
    "seconds_start": 0, # where in time the sample sits within the song
    "seconds_total": 30 # total sample length in seconds; the rest gets padded with silence
}]
seed = 1000
n_steps = 50
cfg = 7.5
sampler = "dpmpp-3m-sde"

output = easy_generate(
    model,
    conditioning=conditioning,
    steps=n_steps, # number of diffusion steps to run
    cfg_scale=cfg, # classifier-free guidance scale
    sampler_type=sampler, # sampling algorithm; see https://github.com/Stability-AI/stable-audio-tools/blob/main/stable_audio_tools/inference/sampling.py#L177 for more options
    seed=seed,
)

# Rearrange audio batch to a single sequence
output = rearrange(output, "b d n -> d (b n)")

# Peak normalize, clip, convert to int16, and trim to the requested length
output = output.to(torch.float32).div(torch.max(torch.abs(output))).clamp(-1, 1).mul(32767).to(torch.int16).cpu()[:, :round(conditioning[0]['seconds_total']*sample_rate)]
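
The one-liner above packs several steps together, so here is a minimal sketch of the same steps on a plain Python list for a single channel, which runs without torch. The function name `to_int16_pcm` and the toy sample rate are assumptions made for illustration; the real cell operates on a torch tensor.

```python
def to_int16_pcm(samples, seconds_total, sample_rate):
    """Peak normalize, clip to [-1, 1], scale to int16 range, and trim.

    Hypothetical single-channel illustration of the tensor one-liner.
    """
    # Peak is taken over the full signal, before trimming.
    peak = max(abs(s) for s in samples)
    if peak == 0:
        peak = 1.0  # guard against an all-zero signal (not in the original cell)

    # Keep only the requested duration.
    n_keep = round(seconds_total * sample_rate)

    pcm = []
    for s in samples[:n_keep]:
        x = max(-1.0, min(1.0, s / peak))  # normalize, then clip
        pcm.append(int(x * 32767))  # int() truncates toward zero
    return pcm


# Toy signal: 4 samples at a made-up rate of 3 samples per second.
pcm = to_int16_pcm([0.5, -1.0, 0.25, 2.0], seconds_total=1, sample_rate=3)
```

Because the peak here is 2.0, every sample is halved before scaling, and only the first three samples survive the trim.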

Now we can listen to the output! Note: if you are running in a Colab notebook, rendering audio will stop the autosave feature, so be sure to delete the cell outputs if you want to turn it back on.

ipd.display(ipd.Audio(output, rate=sample_rate))